- North America > United States > California > Los Angeles County > Long Beach (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- (18 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Data Science (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- North America > United States > New York > Suffolk County > Stony Brook (0.04)
- Asia > Vietnam > Long An Province > Tân An (0.04)
- Asia > Indonesia > Bali (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.67)
On Softmax Direct Preference Optimization for Recommendation
Recommender systems aim to predict personalized rankings based on user preference data. With the rise of Language Models (LMs), LM-based recommenders have been widely explored due to their extensive world knowledge and powerful reasoning abilities. Most LM-based recommenders convert historical interactions into language prompts, pair them with a positive item as the target response, and fine-tune the LM with a language modeling loss. However, this objective fails to fully leverage preference data and is not optimized for personalized ranking tasks, which hinders the performance of LM-based recommenders. Inspired by recent advances in Direct Preference Optimization (DPO) for human preference alignment and the success of softmax loss in recommendation, we propose Softmax-DPO (S-DPO) to instill ranking information into the LM, helping LM-based recommenders distinguish preferred items from negatives rather than focusing solely on positives. Specifically, we incorporate multiple negatives from user preference data and devise an alternative version of the DPO loss tailored for LM-based recommenders, extended from the traditional full-ranking Plackett-Luce (PL) model to partial rankings and connected to softmax sampling strategies. Theoretically, we bridge S-DPO with the softmax loss over negative sampling and find that it has an inherent benefit of mining hard negatives, which underpins its strong performance on recommendation tasks. Empirically, extensive experiments on three real-world datasets demonstrate the superiority of S-DPO in modeling user preference and boosting recommendation performance, while providing better rewards for preferred items.
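The loss the abstract describes can be illustrated in a toy scalar form: each item carries an implicit reward β(log πθ − log πref) for its response tokens, and the multiple negatives are aggregated with a log-sum-exp of their margins over the positive before the usual DPO sigmoid. A minimal, hypothetical sketch (the function name and the scalar-reward simplification are ours, not the paper's — real implementations compute the log-ratios from model logits):

```python
import math

def s_dpo_loss(r_pos, r_negs, beta=1.0):
    """Toy scalar Softmax-DPO loss over one positive and many negatives.

    r_pos / r_negs are per-item log-ratios log(pi_theta / pi_ref),
    assumed precomputed elsewhere. The negatives enter through a
    log-sum-exp of their reward margins over the positive (the
    partial-ranking Plackett-Luce term), then the standard DPO
    -log sigmoid is applied to the negated aggregate margin.
    """
    lse = math.log(sum(math.exp(beta * (rn - r_pos)) for rn in r_negs))
    # -log sigmoid(-lse) == log(1 + exp(lse))
    return math.log(1.0 + math.exp(lse))
```

With a single negative this collapses to the standard pairwise DPO loss, -log σ(β(r_pos − r_neg)); adding more negatives can only increase the log-sum-exp, which is how the objective implicitly emphasizes hard negatives.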
Massive newborn star is firing two plasma jets at once
A newborn star 15,000 light-years from Earth is fascinating astronomers with its dual blasts of superheated plasma jets. The rare sight, captured in stunning detail by the James Webb Space Telescope (JWST), isn't only a display of cosmic forces. It's helping to settle a decades-long debate about the origins of massive stellar objects. Located at the edge of the Milky Way galaxy inside a nebula known as Sharpless 2-284 (Sh2-284), the young protostar is already upwards of 10 times the mass of our sun.
- Asia > Vietnam > Long An Province > Tân An (0.05)
- Asia > Japan (0.05)
GFlowGR: Fine-tuning Generative Recommendation Frameworks with Generative Flow Networks
Wang, Yejing, Zhou, Shengyu, Lu, Jinyu, Liu, Qidong, Li, Xinhang, Zhang, Wenlin, Li, Feng, Wang, Pengjie, Xu, Jian, Zheng, Bo, Zhao, Xiangyu
Generative recommendations (GR), which usually include item tokenizers and generative Large Language Models (LLMs), have demonstrated remarkable success across a wide range of scenarios. The majority of existing research efforts primarily concentrate on developing powerful item tokenizers or advancing LLM decoding strategies to attain superior performance. However, the critical fine-tuning step in GR frameworks, which is essential for adapting LLMs to recommendation data, remains largely unexplored. Current approaches predominantly rely on either the next-token prediction loss of supervised fine-tuning (SFT) or recommendation-specific direct preference optimization (DPO) strategies. Both methods ignore possible positive unobserved samples, a limitation commonly referred to as the exposure bias problem. To mitigate this problem, this paper treats GR as a multi-step generation task and constructs a GFlowNets-based fine-tuning framework (GFlowGR). The proposed framework integrates collaborative knowledge from traditional recommender systems to create an adaptive trajectory sampler and a comprehensive reward model. Leveraging the diverse generation property of GFlowNets, along with sampling and heuristic weighting techniques, GFlowGR emerges as a promising approach to mitigate the exposure bias problem. Extensive empirical results on two real-world datasets and with two different GR backbones highlight the effectiveness and robustness of GFlowGR.
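GFlowNets fine-tuning works by matching the probability of generating a sequence to its reward; a common training objective for this is trajectory balance. A minimal sketch under that assumption (the paper's actual sampler and reward model are much richer; the function name is ours). For autoregressive token generation each sequence has a unique parent at every step, so the backward-policy term is zero:

```python
import math

def trajectory_balance_loss(log_z, log_pf_steps, reward):
    """Trajectory-balance loss for an autoregressive generator:
    (log Z + sum_t log P_F(s_t -> s_{t+1}) - log R(x))^2.

    The backward policy is deterministic for token sequences
    (each prefix has exactly one parent), so log P_B = 0 and
    drops out. log_z is a learned scalar; log_pf_steps are the
    per-token log-probabilities of the sampled trajectory.
    """
    return (log_z + sum(log_pf_steps) - math.log(reward)) ** 2
```

Minimizing this drives the generator toward sampling items with probability proportional to their reward, rather than concentrating all mass on the single observed positive — which is how the framework addresses exposure bias.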
- Asia > Middle East > UAE > Dubai Emirate > Dubai (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Vietnam > Long An Province > Tân An (0.04)
- Asia > China > Hong Kong (0.04)
- Asia > Singapore (0.29)
- Europe > Austria > Vienna (0.14)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- (7 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.92)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government (0.92)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Asia > China (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- (5 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Eq.Bot: Enhance Robotic Manipulation Learning via Group Equivariant Canonicalization
Deng, Jian, Wang, Yuandong, Zhu, Yangfu, Feng, Tao, Wo, Tianyu, Shao, Zhenzhou
Robotic manipulation systems are increasingly deployed across diverse domains. Yet existing multi-modal learning frameworks lack inherent guarantees of geometric consistency, struggling to handle spatial transformations such as rotations and translations. While recent works attempt to introduce equivariance through bespoke architectural modifications, these methods suffer from high implementation complexity, computational cost, and poor portability. Inspired by human cognitive processes in spatial reasoning, we propose Eq.Bot, a universal canonicalization framework grounded in SE(2) group equivariant theory for robotic manipulation learning. Our framework transforms observations into a canonical space, applies an existing policy, and maps the resulting actions back to the original space. As a model-agnostic solution, Eq.Bot aims to endow models with spatial equivariance without requiring architectural modifications. Extensive experiments demonstrate the superiority of Eq.Bot over existing methods on various robotic manipulation tasks under both CNN-based (e.g., CLIPort) and Transformer-based (e.g., OpenVLA-OFT) architectures, where the most significant improvement reaches 50.0%.
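The canonicalize → act → map-back loop the abstract describes can be illustrated with a planar rotation (a deliberately simplified sketch of the SE(2) idea; the function names and the 2-D point interface are our assumptions, not the paper's API):

```python
import math

def canonicalize_rollout(obs_xy, theta, policy):
    """Hypothetical canonicalization wrapper: rotate the observation
    by -theta into a canonical frame, query the unmodified policy,
    then rotate the resulting action back by +theta into the
    original frame. The policy itself is never changed.
    """
    def rot(p, a):
        c, s = math.cos(a), math.sin(a)
        return (c * p[0] - s * p[1], s * p[0] + c * p[1])

    canon_obs = rot(obs_xy, -theta)     # world frame -> canonical frame
    canon_action = policy(canon_obs)    # any existing policy, unchanged
    return rot(canon_action, theta)     # canonical frame -> world frame
```

Because the rotation and its inverse bracket the policy call, the wrapped system is equivariant by construction: rotating the scene rotates the action identically, regardless of the backbone's architecture.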
- Asia > Vietnam > Hanoi > Hanoi (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Vietnam > Long An Province > Tân An (0.04)
- (2 more...)
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Vietnam > Long An Province > Tân An (0.04)